21 research outputs found

    Tracking English and Translated Arabic News using GHSOM


    Subword Recognition in Historical Arabic Documents using C-GRUs

    Recent years have witnessed an increased tendency to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end users direct access to the images. Recognition of Arabic handwriting is challenging due to the highly cursive nature of the script and other challenges associated with historical documents (degradation, etc.). This paper presents an end-to-end system to recognize Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model where the shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two different datasets, IBN SINA and VML-HD, reporting recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of our proposed model in characterizing Arabic subwords.

    An Early Warning Tool for Predicting Mortality Risk of COVID-19 Patients Using Machine Learning

    The COVID-19 pandemic has created extreme pressure on global healthcare services. Fast, reliable, and early clinical assessment of the severity of the disease can help in allocating and prioritizing resources to reduce mortality. In order to study the important blood biomarkers for predicting disease mortality, a retrospective study was conducted on a dataset made public by Yan et al. in [1] of 375 COVID-19 positive patients admitted to Tongji Hospital (China) from January 10 to February 18, 2020. Demographic and clinical characteristics and patient outcomes were investigated using machine learning tools to identify key biomarkers to predict the mortality of individual patients. A nomogram was developed for predicting the mortality risk among COVID-19 patients. Lactate dehydrogenase, neutrophils (%), lymphocyte (%), high-sensitivity C-reactive protein, and age (LNLCA), all acquired at hospital admission, were identified as key predictors of death by a multi-tree XGBoost model. The areas under the curve (AUC) of the nomogram for the derivation and validation cohorts were 0.961 and 0.991, respectively. An integrated score (LNLCA) was calculated with the corresponding death probability. COVID-19 patients were divided into three subgroups (low-, moderate-, and high-risk) using LNLCA cutoff values of 10.4 and 12.65, with death probabilities of less than 5%, 5–50%, and above 50%, respectively. The prognostic model, nomogram, and LNLCA score can help in early detection of high mortality risk in COVID-19 patients, which will help doctors to improve the management of patient stratification. Open access funding provided by the Qatar National Library. This publication was made possible by Qatar University Emergency Response Grant (QUERG-CENG-2020-1) from Qatar University. The statements made herein are solely the responsibility of the authors.
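
    The three-way stratification described in this abstract can be sketched as a simple threshold rule. This is a minimal illustration, not the authors' code: the function name `lnlca_risk_group` is hypothetical, and the abstract does not say which group a score exactly equal to a cutoff belongs to, so the boundary handling below is an assumption. Computing the LNLCA score itself requires the nomogram and is not shown.

```python
# Cutoff values are taken from the abstract; everything else is illustrative.
LOW_CUTOFF = 10.4
HIGH_CUTOFF = 12.65


def lnlca_risk_group(score: float) -> str:
    """Map an LNLCA score to a risk group.

    Assumption: scores below 10.4 are low risk, scores above 12.65 are
    high risk, and everything in between (cutoffs included) is moderate.
    """
    if score < LOW_CUTOFF:
        return "low"
    if score > HIGH_CUTOFF:
        return "high"
    return "moderate"
```

    For example, under these assumptions `lnlca_risk_group(11.0)` falls in the moderate group, which the abstract associates with a death probability of 5–50%.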

    The global, regional, and national burden of adult lip, oral, and pharyngeal cancer in 204 countries and territories: A systematic analysis for the Global Burden of Disease Study 2019

    Importance: Lip, oral, and pharyngeal cancers are important contributors to cancer burden worldwide, and a comprehensive evaluation of their burden globally, regionally, and nationally is crucial for effective policy planning. Objective: To analyze the total and risk-attributable burden of lip and oral cavity cancer (LOC) and other pharyngeal cancer (OPC) for 204 countries and territories and by Socio-demographic Index (SDI) using 2019 Global Burden of Diseases, Injuries, and Risk Factors (GBD) Study estimates. Evidence Review: The incidence, mortality, and disability-adjusted life years (DALYs) due to LOC and OPC from 1990 to 2019 were estimated using GBD 2019 methods. The GBD 2019 comparative risk assessment framework was used to estimate the proportion of deaths and DALYs for LOC and OPC attributable to smoking, tobacco, and alcohol consumption in 2019. Findings: In 2019, 370 000 (95% uncertainty interval [UI], 338 000-401 000) cases and 199 000 (95% UI, 181 000-217 000) deaths for LOC and 167 000 (95% UI, 153 000-180 000) cases and 114 000 (95% UI, 103 000-126 000) deaths for OPC were estimated to occur globally, contributing 5.5 million (95% UI, 5.0-6.0 million) and 3.2 million (95% UI, 2.9-3.6 million) DALYs, respectively. From 1990 to 2019, low-middle and low SDI regions consistently showed the highest age-standardized mortality rates due to LOC and OPC, while the high SDI strata exhibited age-standardized incidence rates decreasing for LOC and increasing for OPC. Globally in 2019, smoking had the greatest contribution to risk-attributable OPC deaths for both sexes (55.8% [95% UI, 49.2%-62.0%] of all OPC deaths in male individuals and 17.4% [95% UI, 13.8%-21.2%] of all OPC deaths in female individuals). Smoking and alcohol both contributed to substantial LOC deaths globally among male individuals (42.3% [95% UI, 35.2%-48.6%] and 40.2% [95% UI, 33.3%-46.8%] of all risk-attributable cancer deaths, respectively), while chewing tobacco contributed to the greatest attributable LOC deaths among female individuals (27.6% [95% UI, 21.5%-33.8%]), driven by high risk-attributable burden in South and Southeast Asia. Conclusions and Relevance: In this systematic analysis, disparities in LOC and OPC burden existed across the SDI spectrum, and a considerable percentage of burden was attributable to tobacco and alcohol use. These estimates can contribute to an understanding of the distribution and disparities in LOC and OPC burden globally and support cancer control planning efforts.

    Global, regional, and national burden of stroke and its risk factors, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019

    Background: Regularly updated data on stroke and its pathological types, including data on their incidence, prevalence, mortality, disability, risk factors, and epidemiological trends, are important for evidence-based stroke care planning and resource allocation. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) aims to provide a standardised and comprehensive measurement of these metrics at global, regional, and national levels. Methods: We applied GBD 2019 analytical tools to calculate stroke incidence, prevalence, mortality, disability-adjusted life-years (DALYs), and the population attributable fraction (PAF) of DALYs (with corresponding 95% uncertainty intervals [UIs]) associated with 19 risk factors, for 204 countries and territories from 1990 to 2019. These estimates were provided for ischaemic stroke, intracerebral haemorrhage, subarachnoid haemorrhage, and all strokes combined, and stratified by sex, age group, and World Bank country income level. Findings: In 2019, there were 12·2 million (95% UI 11·0–13·6) incident cases of stroke, 101 million (93·2–111) prevalent cases of stroke, 143 million (133–153) DALYs due to stroke, and 6·55 million (6·00–7·02) deaths from stroke. Globally, stroke remained the second-leading cause of death (11·6% [10·8–12·2] of total deaths) and the third-leading cause of death and disability combined (5·7% [5·1–6·2] of total DALYs) in 2019. From 1990 to 2019, the absolute number of incident strokes increased by 70·0% (67·0–73·0), prevalent strokes increased by 85·0% (83·0–88·0), deaths from stroke increased by 43·0% (31·0–55·0), and DALYs due to stroke increased by 32·0% (22·0–42·0). During the same period, age-standardised rates of stroke incidence decreased by 17·0% (15·0–18·0), mortality decreased by 36·0% (31·0–42·0), prevalence decreased by 6·0% (5·0–7·0), and DALYs decreased by 36·0% (31·0–42·0). However, among people younger than 70 years, prevalence rates increased by 22·0% (21·0–24·0) and incidence rates increased by 15·0% (12·0–18·0). In 2019, the age-standardised stroke-related mortality rate was 3·6 (3·5–3·8) times higher in the World Bank low-income group than in the World Bank high-income group, and the age-standardised stroke-related DALY rate was 3·7 (3·5–3·9) times higher in the low-income group than the high-income group. Ischaemic stroke constituted 62·4% of all incident strokes in 2019 (7·63 million [6·57–8·96]), while intracerebral haemorrhage constituted 27·9% (3·41 million [2·97–3·91]) and subarachnoid haemorrhage constituted 9·7% (1·18 million [1·01–1·39]). In 2019, the five leading risk factors for stroke were high systolic blood pressure (contributing to 79·6 million [67·7–90·8] DALYs or 55·5% [48·2–62·0] of total stroke DALYs), high body-mass index (34·9 million [22·3–48·6] DALYs or 24·3% [15·7–33·2]), high fasting plasma glucose (28·9 million [19·8–41·5] DALYs or 20·2% [13·8–29·1]), ambient particulate matter pollution (28·7 million [23·4–33·4] DALYs or 20·1% [16·6–23·0]), and smoking (25·3 million [22·6–28·2] DALYs or 17·6% [16·4–19·0]). Interpretation: The annual number of strokes and deaths due to stroke increased substantially from 1990 to 2019, despite substantial reductions in age-standardised rates, particularly among people older than 70 years. The highest age-standardised stroke-related mortality and DALY rates were in the World Bank low-income group. The fastest-growing risk factor for stroke between 1990 and 2019 was high body-mass index. Without urgent implementation of effective primary prevention strategies, the stroke burden will probably continue to grow across the world, particularly in low-income countries.
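
    The subtype shares quoted in this abstract (62.4%, 27.9%, and 9.7% of incident strokes) can be reproduced from the reported point estimates with a few lines of arithmetic. The sketch below is only an illustration: it ignores the uncertainty intervals, and it assumes the percentages are taken relative to the sum of the three subtype counts, which the abstract does not state explicitly. The variable names are hypothetical.

```python
# Incident cases in 2019 by stroke type, in millions
# (point estimates from the abstract, uncertainty intervals omitted).
incident_millions = {
    "ischaemic stroke": 7.63,
    "intracerebral haemorrhage": 3.41,
    "subarachnoid haemorrhage": 1.18,
}

# Assumption: the denominator is the sum of the three subtypes (about 12.2 million).
total = sum(incident_millions.values())

# Share of each subtype in percent, rounded to one decimal place.
shares = {
    name: round(100 * count / total, 1)
    for name, count in incident_millions.items()
}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

    Under that assumption the rounded shares come out as 62.4, 27.9, and 9.7, matching the figures in the abstract.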

    HANDWRITING RECOGNITION IN ARABIC HISTORICAL MANUSCRIPTS

    No full text
    Document Analysis and Recognition significantly impacts humanitarian studies by revealing information hidden in historical document collections worldwide. This research area merges the sciences of computer vision and machine learning. This PhD dissertation aims at recognizing text in Arabic historical handwritten documents by learning and extracting visual representations inside these manuscripts. The approaches presented in this dissertation have the primary purpose of creating effective systems to deal with challenges linked to Arabic handwriting recognition, particularly in ancient manuscripts with old handwriting. The use of Convolutional Neural Networks (CNNs) to tackle the Arabic handwriting recognition challenges is an integral part of this dissertation. Several architectures for developing high-performing features are suggested. A model based on CNNs and Gated Recurrent Units (GRUs) is used to recognize a wide range of handwritten Arabic subwords extracted from historical documents. Because recent research has shown that the learning performance of typical CNNs is limited, as they are homogeneous networks with a simple (linear) neuron model, a further improvement in the handwriting recognition models using non-linear neuron models is implemented. Operational Neural Networks (ONNs) were recently proposed as heterogeneous networks with a non-linear neuron model. Even with compact architectures, they can learn highly complex and multi-modal functions. This PhD dissertation investigates the use of Self-Organized Operational Neural Networks (Self-ONNs) for handwriting recognition and the generalization capabilities of non-linear neuron models, i.e., whether deep discriminative features can be created. An investigation of an adequate level of non-linearity of the Self-ONN layers, providing extensive information on Self-ONN performance under various topologies, is presented. With this novel approach, superior performance is achieved on a historical Arabic dataset, and state-of-the-art performance is achieved on an English dataset with a significant gap over all recent methods. Furthermore, a novel method for disambiguating undotted Arabic characters is presented. While the method is useful for handwriting recognition systems dealing with Arabic manuscripts with ancient undotted letters, it also improved the visual recognition performance on current Arabic handwritten documents with dotted and diacritized characters.

    Finding English and translated Arabic documents similarities using GHSOM

    No full text
    The idea behind finding similar news across Arabic and English sources is to provide the audience with multiple views of the broadcast news, because reading the news from a single source may not always reflect what is happening around the world, owing to the different backgrounds, cultures, and opinions of readers and writers. To achieve this goal, many techniques have been used to cluster documents with similar themes. In this paper, we analyze the similarity of the views in news from translated Arabic and English texts using a Self-Organizing Map (SOM). However, we found some difficulties in SOM that affect its performance. To address these performance problems, we used a Growing Hierarchical Self-Organizing Map (GHSOM). The main advantage of such a mapping is the ease with which a user gains an idea of the structure of the data by analyzing the map. Thousands of news documents were collected from Arabic and English news sources on the web in order to train both algorithms. The experimental results show that GHSOM is better at clustering documents with the same opinions.
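
    To make the clustering idea in this abstract concrete, here is a minimal sketch of a plain, fixed-size SOM in pure Python. It is not the authors' implementation and it is not a GHSOM: the growing and hierarchical steps are left out. It also assumes that each document has already been turned into a numeric vector (for example term weights); the abstract does not describe the representation used. The names `best_matching_unit` and `train_som`, the toy data, and all parameter values are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the unit whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small fixed-size SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink as training proceeds.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.01)
        for x in data:
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood on the grid around the winning unit.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy "documents": two groups with clearly different term weights (made up).
docs = [[1.0, 0.9, 0.0, 0.1], [0.9, 1.0, 0.1, 0.0],
        [0.0, 0.1, 1.0, 0.9], [0.1, 0.0, 0.9, 1.0]]
som = train_som(docs)
clusters = [best_matching_unit(som, d) for d in docs]
```

    Each entry of `clusters` is the grid cell a document maps to, so documents that land in the same cell are treated as similar. A GHSOM would additionally grow the map and add child maps where a cell represents its documents poorly.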